Insights

Overall admission results.


The overall admission rate is \(35.8\%\).

Admission results by gender.


  • male 44, admit 15 + decline 16 + waitlist 13
  • female 34, admit 12 + decline 12 + waitlist 10

Decision:

Applicants by geolocations.


Originally the plan was to show the applicants’ measures by state. But due to the size of dataset, the result is quite sparse. So hear we decide to devide the states into two partitions: East and West.

Decision:

Grade matters? Does the GPA or standardized testing determine a person’s success?


Decision:

Volunteer level. Or it outweighted by work experience?


Decision:

About

Column

The dataset contains 81 valid admission results from the CSV file SummerStudentAdmissions2.csv.

Three versions of this dataset are included on the right hand side:

  • Standardized data.
  • Cleaned data.
  • Raw data.

Due to the lack of information, some of the variables and contents from the dataset are interpreted intuitively.

In the cleaned dataset,

  • gender=-1 means the gender is undisclosed.
  • volunteer_level is ranked from 5 to 0.
  • gpa is calculated on a 4.0 scale.
  • writing_score should be on a 100 scale.
  • test_score has rather limited information.
  • work_exp’s unit is year.

The dashboard is powered by


The codes are open-sourced. Please feel free to star or fork this repository.

Star Fork

Column

Standardized data

Cleaned data

Raw data

---
title: "Admission Dashboard"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    orientation: columns
    vertical_layout: fill
    social: ["twitter", "linkedin"]
    source_code: embed
    theme: bootstrap
    logo: static/logo.png
    favicon: static/favicon.png
    css: style.css
---

```{r setup, include=FALSE}
gc()
rm(list = ls())

if (!require("pacman")) install.packages("pacman")
pacman::p_load(
  tidyverse, flexdashboard,
  here, styler, patchwork,
  hrbrthemes, ggthemes, ggtext, plotly,
  glue, waffle, DT, geofacet, ggbeeswarm
)

dat <- read_csv(here("data/data-cleaned.csv"))

style_file("index.Rmd")

# standardize all numeric variables

dat_stand <- dat |>
  mutate(
    decision = as_factor(decision),
    state = as_factor(state),
    gender = as_factor(gender),
    across(where(is.numeric), ~ round(scale(.)[, 1], 2)),
    partition = if_else(
      state %in% c("California", "Colorado", "Utah", "Oregon"), "west", "east"
    )
  )

dat_stand_long <- dat_stand |> pivot_longer(
  cols = c(
    gpa, work_exp, test_score,
    writing_score, volunteer_level
  ),
  names_to = "variable",
  values_to = "value"
)

dat_long <- dat |>
  mutate(partition = if_else(
    state %in% c("California", "Colorado", "Utah", "Oregon"), "west", "east"
  )) |>
  pivot_longer(
    cols = c(
      gpa, work_exp, test_score,
      writing_score, volunteer_level
    ),
    names_to = "variable",
    values_to = "value"
  )
```

Insights {.storyboard data-icon="fa-chart-line" data-commentary-width=200}
===================================== 


### **Overall admission results.**

```{r, fig.width=8, fig.height=8}
dat |>
  count(decision) -> admission_summary

p1 <- ggplot(admission_summary, aes(fill = decision, values = n)) +
  geom_waffle(color = "white", size = 1.125, n_rows = 9, flip = TRUE) +
  scale_fill_manual(
    values = c("#1A6899", "#FC5449", "#FFCF58"),
    labels = c("Admit", "Decline", "Waitlist")
  ) +
  coord_equal() +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "SUMMER 2022 ADMISSION RESULTS",
    subtitle = "BAD DATA EXCLUDED.",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "",
    y = ""
  )

p1
```

***

The overall admission rate is $35.8\%$.

### **Admission results by gender.**

```{r, fig.width=12,fig.height=8}
# - male 44, admit 15 + decline 16 + waitlist 13
# - female 34, admit 12 + decline 12 + waitlist 10

dat |>
  select(decision, gender) |>
  mutate(gender = as_factor(gender)) |>
  ggplot(aes(gender)) +
  geom_bar(aes(fill = decision),
    position = position_stack(reverse = TRUE),
    width = 0.2
  ) +
  scale_fill_manual(
    values = c("#1A6899", "#FC5449", "#FFCF58"),
    labels = c("Admit", "Decline", "Waitlist")
  ) +
  scale_x_discrete(labels = c("undisclosed", "female", "male")) +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.text.x = element_text(),
    axis.text.y = element_text(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "SUMMER 2022 ADMISSION RESULTS",
    subtitle = "BY GENDER",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "gender",
    y = "count"
  )
```

*** 

- male 44, admit 15 + decline 16 + waitlist 13
- female 34, admit 12 + decline 12 + waitlist 10

**Decision:**


### **Applicants by geolocations.**

```{r, fig.width=12,fig.height=8}
# dat_long |>
#   summary()
p3 <- dat_long |>
  ggplot(aes(x = value, color = partition, fill = partition)) +
  geom_density(alpha = 0.3) +
  facet_wrap(~variable, scales = "free") +
  scale_fill_manual(
    values = c("#FC5449", "#1A6899"),
    labels = c("east", "west")
  ) +
  scale_color_manual(
    values = c("#FC5449", "#1A6899"),
    labels = c("east", "west")
  ) +
  theme_ipsum_rc() +
  theme(
    legend.position = "none",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    # axis.text.x = element_blank(),
    # axis.text.y = element_blank(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "SUMMER 2022 ADMISSION RESULTS",
    subtitle = "**WEST** VS **EAST**",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "candidates' standardized measurement",
    y = "density"
  )

# ggplotly(p3)
p3
```

*** 

Originally the plan was to show the applicants' measures by state. But due to the size of dataset, the result is quite sparse. So hear we decide to devide the states into two partitions: `East` and `West`.

**Decision:**

 
### **Grade matters?** Does the GPA or standardized testing determine a person's success?


```{r, fig.width=16,fig.height=8}
p4 <- dat |>
  ggplot(aes(x = decision, y = gpa, color = decision)) +
  geom_beeswarm(size = 2) +
  scale_color_manual(
    values = c("#1A6899", "#FC5449", "#FFCF58"),
    labels = c("Admit", "Decline", "Waitlist")
  ) +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    # axis.text.x = element_text(),
    # axis.text.y = element_text(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "BEESWARM PLOT OF GPA",
    subtitle = "BY DECISION",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "decision",
    y = "gpa (4.0)"
  )

p5 <- dat |>
  ggplot(aes(x = decision, y = test_score, color = decision)) +
  geom_beeswarm(size = 2) +
  scale_color_manual(
    values = c("#1A6899", "#FC5449", "#FFCF58"),
    labels = c("Admit", "Decline", "Waitlist")
  ) +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    # axis.text.x = element_text(),
    # axis.text.y = element_text(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "BEESWARM PLOT OF STANDARDIZED TEST SCORE",
    subtitle = "BY DECISION",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "decision",
    y = "score"
  )

p4 + p5
```

***

**Decision:**

### **Volunteer level.** Or it outweighted by work experience?


```{r, fig.width=12,fig.height=8}
p6 <- dat |>
  ggplot(aes(x = work_exp, y = volunteer_level, color = decision)) +
  geom_beeswarm(size = 2) +
  scale_color_manual(
    values = c("#1A6899", "#FC5449", "#FFCF58"),
    labels = c("Admit", "Decline", "Waitlist")
  ) +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    # axis.text.x = element_text(),
    # axis.text.y = element_text(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "WORK EXPERIENCE VS VOLUNTEER LEVEL",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "work experience (years)",
    y = "volunteer level (0-5)"
  )

p6
```

***

**Decision:**


About {data-icon="fa-info" data-orientation=columns}
=====================================

Column {data-width=350}
-----------------------------------------------------------------------

The dataset contains 81 valid admission results from the CSV file `SummerStudentAdmissions2.csv`.

Three versions of this dataset are included on the right hand side:

- Standardized data.
- Cleaned data.
- Raw data.

Due to the lack of information, some of the variables and contents from the dataset are interpreted intuitively.

In the cleaned dataset,

- `gender=-1` means the gender is undisclosed.
- `volunteer_level` is ranked from 5 to 0.
- `gpa` is calculated on a 4.0 scale.
- `writing_score` should be on a 100 scale.
- `test_score` has rather limited information.
- `work_exp`'s unit is year.

***

The dashboard is powered by

- [`flexdashboard`](https://pkgs.rstudio.com/flexdashboard/)
- [`DT`](https://rstudio.github.io/DT/)
- [`plotly`](https://plotly.com/)
- The static visualization theme is customized based on [`hrbrmstr`](https://github.com/hrbrmstr/hrbrthemes).


***

The codes are open-sourced. **Please feel free to star or fork this repository.**

Star Fork

Column {.tabset data-width=650 data-height=1000} ----------------------------------------------------------------------- ### Standardized data ```{r} DT::datatable(dat_stand, options = list( bPaginate = FALSE ), style = "bootstrap" ) |> formatStyle( "decision", backgroundColor = styleEqual( c("Admit", "Decline", "Waitlist"), c("#1A6899", "#FC5449", "#FFCF58") ) ) |> formatStyle(c( "gpa", "work_exp", "test_score", "writing_score", "volunteer_level" ), background = styleColorBar(range(c( dat_stand$gpa, dat_stand$work_exp, dat_stand$test_score, dat_stand$writing_score, dat_stand$volunteer_level )), "lightblue"), backgroundSize = "98% 88%", backgroundRepeat = "no-repeat", backgroundPosition = "center" ) ``` ### Cleaned data ```{r} dat <- read_csv("data/data-cleaned.csv") DT::datatable(dat, options = list( bPaginate = FALSE ), style = "bootstrap" ) |> formatStyle( "decision", backgroundColor = styleEqual( c("Admit", "Decline", "Waitlist"), c("#1A6899", "#FC5449", "#FFCF58") ) ) ``` ### Raw data ```{r} DT::datatable(read_csv("data/SummerStudentAdmissions2.csv"), options = list( bPaginate = FALSE ), style = "bootstrap" ) ```